
What is DNA sequence assembly?



 

 

 

Definition of DNA sequence assembly

 

DNA sequence assembly is the process by which short DNA sequence fragments (called reads or samples) are merged into a longer DNA sequence in an attempt to reconstruct the original DNA sequence.
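The merging step can be illustrated with a toy example. The sketch below is not how any particular assembler works: it assumes error-free reads that are already in the correct order and orientation, and it joins them on the longest exact suffix/prefix overlap. The function names and the `min_overlap` parameter are made up for illustration.

```python
from typing import List, Optional


def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    max_len = min(len(left), len(right))
    # Try the longest possible overlap first and shrink until one matches.
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge_reads(left: str, right: str, min_overlap: int = 3) -> Optional[str]:
    """Merge two reads on their overlap, or return None if they do not overlap."""
    size = overlap_length(left, right, min_overlap)
    if size == 0:
        return None
    return left + right[size:]


def assemble_in_order(reads: List[str], min_overlap: int = 3) -> str:
    """Chain reads (assumed already sorted and oriented) into one contig."""
    contig = reads[0]
    for read in reads[1:]:
        merged = merge_reads(contig, read, min_overlap)
        if merged is None:
            raise ValueError(f"read {read!r} does not overlap the contig")
        contig = merged
    return contig


if __name__ == "__main__":
    reads = ["ATGCGTAC", "GTACCTGA", "CTGATTAG"]
    print(assemble_in_order(reads))  # ATGCGTACCTGATTAG
```

Real reads contain sequencing errors and arrive unordered, so practical assemblers rely on approximate alignment and base quality values rather than exact string matching.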

 

The longer sequence resulting from sequence assembly is called a 'contig'. During sequence assembly, the short DNA fragments may also be aligned to a reference sequence in order to reveal the differences between the resulting contig and the reference sequence.
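As a rough illustration of comparing a contig with a reference, the sketch below assumes the two sequences have already been aligned to the same length (with `-` marking gaps) and simply reports the positions where they differ. The function name `find_differences` and the example sequences are hypothetical; the alignment itself is not shown.

```python
from typing import List, Tuple


def find_differences(contig_aligned: str, reference_aligned: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, reference base, contig base) for every mismatch.

    Assumes both strings come from the same alignment, so they have equal
    length and position i in one corresponds to position i in the other.
    """
    if len(contig_aligned) != len(reference_aligned):
        raise ValueError("aligned sequences must have the same length")
    differences = []
    for position, (contig_base, ref_base) in enumerate(
        zip(contig_aligned.upper(), reference_aligned.upper()), start=1
    ):
        if contig_base != ref_base:
            differences.append((position, ref_base, contig_base))
    return differences


if __name__ == "__main__":
    reference = "ATGCGTACCTGA"
    contig = "ATGCGTTCCTGA"
    for position, ref_base, contig_base in find_differences(contig, reference):
        print(f"position {position}: reference {ref_base}, contig {contig_base}")
    # position 7: reference A, contig T
```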

 

Sequence assembly is needed because current DNA sequencing technology cannot read a whole genome in one step. Instead, sequencing machines read small pieces: from around 30 to a few hundred bases for short-read NextGen sequencers (used in shotgun sequencing) and up to roughly 1000 bases for Sanger (older) sequencers. Technologies that can read up to 30000 bases are also available. Sanger is the oldest of these technologies but also one of the most accurate. However, it has a major disadvantage: it is expensive, which makes it impractical for sequencing whole genomes.

1. DNA sequences in random order


2. DNA sequences after flipping some of them (reverse complement) and placing them in the correct order. (Observe the red and the purple sequences.)


3. DNA sequences after assembly (contig)
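The "flipping" in step 2 is the reverse complement operation: the read is reversed and each base is replaced by its complement (A with T, C with G). The sketch below is a minimal illustration, assuming plain A/C/G/T reads without errors. Other characters, including IUPAC ambiguity codes, are left uncomplemented here. The helper `orient_read` is a hypothetical name: it keeps whichever orientation of a read overlaps the end of the growing contig better.

```python
# Translation table for plain bases only; other characters pass through unchanged.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence: str) -> str:
    """Complement every base, then reverse the sequence."""
    return sequence.translate(COMPLEMENT)[::-1]


def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def orient_read(contig: str, read: str, min_overlap: int = 3) -> str:
    """Return `read` as given or flipped, whichever overlaps the contig end better."""
    flipped = reverse_complement(read)
    if overlap_length(contig, flipped, min_overlap) > overlap_length(contig, read, min_overlap):
        return flipped
    return read


if __name__ == "__main__":
    contig = "ATGCGTAC"
    read = "TCAGGTAC"  # sequenced from the opposite strand
    oriented = orient_read(contig, read)
    print(oriented)  # GTACCTGA
    size = overlap_length(contig, oriented)
    print(contig + oriented[size:])  # ATGCGTACCTGA
```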

 

To illustrate the complexity and the problems of the DNA sequence assembly process, imagine taking a big book, shredding its pages in a shredder, and then reconstructing the book by putting all the fragments back together in the correct order! Now imagine that the book has many duplicated paragraphs and that the shredder, instead of cutting some pages cleanly, destroys them completely.

 

DNA Sequence Assembler is the top DNA sequence assembler on the market. It incorporates several cutting-edge technologies that are not available in any other sequence assembler:

  • High-trust base caller
  • Automatic trimming of low-quality regions
  • Automatic reverse complement
  • Automatic removal of recognition sequences (vector/primer)
  • Automatic mutation detection
  • Automatic correction of contig ambiguities
  • Batch sequence assembly (it can automatically assemble millions of reads without human intervention)
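To give a feel for what trimming low-quality regions means, here is a deliberately simple sketch. It is not the algorithm used by the product described above. It assumes each base comes with a Phred-like quality value and removes bases from both ends until it reaches one at or above a threshold. The function name and the default threshold of 20 are illustrative assumptions.

```python
from typing import Sequence


def trim_low_quality(bases: str, qualities: Sequence[int], min_quality: int = 20) -> str:
    """Trim both ends of a read up to the first and last base with quality >= min_quality.

    Low-quality bases inside the kept region are left untouched.
    """
    if len(bases) != len(qualities):
        raise ValueError("each base needs exactly one quality value")
    good_positions = [i for i, quality in enumerate(qualities) if quality >= min_quality]
    if not good_positions:
        return ""  # nothing in the read reaches the threshold
    return bases[good_positions[0]: good_positions[-1] + 1]


if __name__ == "__main__":
    bases = "NNACGTACGTNN"
    qualities = [5, 8, 30, 35, 40, 40, 38, 36, 30, 25, 9, 4]
    print(trim_low_quality(bases, qualities))  # ACGTACGT
```

Sanger chromatograms are usually worst at the start and the end of a read, which is why trimming from the ends is a common first approximation.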

 

 

 

 

 

 

 
